Using just the credit and loyalty card data, identify the most popular locations, and when they are popular. What anomalies do you see? What corrections would you recommend to correct these anomalies? Please limit your answer to 8 images and 300 words.
The loyalty and credit card data show that the three most popular locations over the two-week period are Katerina’s Cafe, followed by Hippokampos and Guy’s Gyros.
As shown in the data table below, the loyalty and credit card data agree on which locations are most popular.
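A ranking like this can be produced by counting transactions per location in each source and comparing the two counts side by side. The snippet below is a minimal sketch on toy data, not the code used for the table; the column name `location` and the helper `rank_locations()` are assumptions made for illustration.

```r
library(dplyr)

# Toy stand-ins for the two card datasets (hypothetical rows;
# the column name `location` is an assumption).
cc <- data.frame(location = c("Katerina's Cafe", "Katerina's Cafe", "Hippokampos",
                              "Katerina's Cafe", "Hippokampos", "Guy's Gyros"))
loyalty <- data.frame(location = c("Katerina's Cafe", "Hippokampos", "Katerina's Cafe",
                                   "Guy's Gyros", "Hippokampos", "Katerina's Cafe"))

# Hypothetical helper: count transactions per location, most frequent first.
rank_locations <- function(df) {
  df %>% count(location, name = "n_txn", sort = TRUE)
}

cc_rank <- rank_locations(cc)
loyalty_rank <- rank_locations(loyalty)

# Put both counts in one table so the rankings can be compared directly.
comparison <- full_join(cc_rank, loyalty_rank, by = "location",
                        suffix = c("_cc", "_loyalty"))
print(comparison)
```

If the two sources agree, the locations appear in the same order in both rankings.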
The stacked bar chart below shows the dates on which these locations were most popular according to the loyalty and credit card data; the two sources give different results. Based on loyalty card transactions, Katerina’s Cafe was most popular on 11 January 2014, with 19 transactions from GASTech employees, Hippokampos had the most transactions on 8 January 2014, and Guy’s Gyros had the most on 15 January 2014. Based on credit card data, however, Katerina’s Cafe was most popular on 6 January 2014, Hippokampos on 16 January 2014, and Guy’s Gyros on 13 January 2014.
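One way to find the busiest date per location is to count transactions by location and date, then keep the date with the highest count in each group. The following is a sketch on made-up credit card rows; the column names `timestamp` and `location` are assumptions.

```r
library(dplyr)

# Hypothetical credit card transactions (assumed columns: timestamp, location).
cc <- data.frame(
  timestamp = as.POSIXct(c("2014-01-06 12:05", "2014-01-06 19:30", "2014-01-07 13:10",
                           "2014-01-16 12:40", "2014-01-16 13:00", "2014-01-08 12:20"),
                         tz = "UTC"),
  location = c(rep("Katerina's Cafe", 3), rep("Hippokampos", 3))
)

peak_days <- cc %>%
  mutate(date = as.Date(timestamp)) %>%          # drop the time of day
  count(location, date, name = "n_txn") %>%      # transactions per location and day
  group_by(location) %>%
  slice_max(n_txn, n = 1, with_ties = FALSE) %>% # busiest day per location
  ungroup()

print(peak_days)
```

Running the same steps on the loyalty card data gives the second set of peak dates for comparison.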
The interactive bar graph below uses the credit card data to show when during the day the top three locations are visited. Most GASTech employees visited Katerina’s Cafe and Guy’s Gyros at dinner time, whereas Hippokampos is more popular at lunch time, except on weekends.
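The hourly pattern can be summarised by extracting the hour and a weekday/weekend flag from each timestamp and counting transactions per combination. This is a sketch on invented rows, assuming columns named `timestamp` and `location`; it produces only the summary table, not the interactive graph.

```r
library(dplyr)

# Hypothetical credit card transactions (assumed columns: timestamp, location).
cc <- data.frame(
  timestamp = as.POSIXct(c("2014-01-06 19:30", "2014-01-07 19:45", "2014-01-08 12:15",
                           "2014-01-09 12:40", "2014-01-11 19:10"), tz = "UTC"),
  location = c("Katerina's Cafe", "Katerina's Cafe", "Hippokampos",
               "Hippokampos", "Hippokampos")
)

hourly <- cc %>%
  mutate(
    hour = as.integer(format(timestamp, "%H")),
    # POSIXlt weekday: 0 = Sunday, 6 = Saturday
    day_type = if_else(as.POSIXlt(timestamp)$wday %in% c(0, 6), "weekend", "weekday")
  ) %>%
  count(location, day_type, hour, name = "n_txn")

print(hourly)
```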
As shown above, the daily transaction counts differ between the credit card data and the loyalty card data. Further investigation found a total of 409 unmatched records, which might lead to new clues. The unmatched records are listed in the table below.
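Unmatched records of this kind can be found with anti-joins in both directions. The sketch below assumes that a credit card record and a loyalty record describe the same purchase when date, location, and price all agree; that matching rule and the column names are assumptions, and the toy rows are invented.

```r
library(dplyr)

# Hypothetical rows; matching on date + location + price is an assumed rule.
cc <- data.frame(
  date = as.Date(c("2014-01-06", "2014-01-06", "2014-01-07")),
  location = c("Katerina's Cafe", "Hippokampos", "Guy's Gyros"),
  price = c(12.50, 30.00, 8.75),
  last4ccnum = c("1111", "2222", "3333")
)
loyalty <- data.frame(
  date = as.Date(c("2014-01-06", "2014-01-07", "2014-01-08")),
  location = c("Katerina's Cafe", "Guy's Gyros", "Hippokampos"),
  price = c(12.50, 8.75, 21.00),
  loyaltynum = c("L1", "L3", "L9")
)

keys <- c("date", "location", "price")

# Rows present in one source with no counterpart in the other.
cc_only <- anti_join(cc, loyalty, by = keys)
loyalty_only <- anti_join(loyalty, cc, by = keys)

unmatched <- bind_rows(
  mutate(cc_only, source = "credit card only"),
  mutate(loyalty_only, source = "loyalty only")
)
print(unmatched)
```

Exact matching on price is strict: a purchase recorded with slightly different amounts in the two sources would show up as two unmatched rows.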
To correct this anomaly, we can add the vehicle data to the analysis: employee locations may allow us to attribute some of these transactions.
#QUESTION 2
Based on employee locations, we can infer the time range of the credit card transactions at Bean There Done That, Brewed Awakenings, and Jack’s Magical Beans, although some transactions at Jack’s Magical Beans have no match in the location data.
#QUESTION 3
Can you infer the owners of each credit card and loyalty card? What is your evidence? Where are there uncertainties in your method? Where are there uncertainties in the data? Please limit your answer to 8 images and 500 words.
Yes, we can infer the owner of each credit card and loyalty card by cross-checking the credit card data against the vehicle data on a map.
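The cross-check can also be approximated in a table rather than on a map: pair each credit card transaction with every vehicle stop at the same location, keep the pairs where the purchase falls inside the stop window, and count how often each card coincides with each car. The snippet below is a sketch under stated assumptions, not the method used for the map; the `stops` table (with `car_id`, `arrive`, `depart`) and all rows are hypothetical.

```r
library(dplyr)

# Hypothetical vehicle stops already labelled with a location.
stops <- data.frame(
  car_id = c(1, 1, 2, 2),
  location = c("Katerina's Cafe", "Hippokampos", "Katerina's Cafe", "Guy's Gyros"),
  arrive = as.POSIXct(c("2014-01-06 19:00", "2014-01-07 12:00",
                        "2014-01-06 12:00", "2014-01-07 19:00"), tz = "UTC"),
  depart = as.POSIXct(c("2014-01-06 20:00", "2014-01-07 13:00",
                        "2014-01-06 13:00", "2014-01-07 20:00"), tz = "UTC")
)

# Hypothetical credit card transactions.
cc <- data.frame(
  last4ccnum = c("1111", "1111", "2222", "2222"),
  location = c("Katerina's Cafe", "Hippokampos", "Katerina's Cafe", "Guy's Gyros"),
  timestamp = as.POSIXct(c("2014-01-06 19:30", "2014-01-07 12:30",
                           "2014-01-06 12:15", "2014-01-07 19:45"), tz = "UTC")
)

candidates <- merge(cc, stops, by = "location") %>%      # every card/stop pair per location
  filter(timestamp >= arrive, timestamp <= depart) %>%   # purchase made during the stop
  count(last4ccnum, car_id, name = "n_matches") %>%
  group_by(last4ccnum) %>%
  slice_max(n_matches, n = 1, with_ties = TRUE) %>%      # keep ties: they signal uncertainty
  ungroup()

print(candidates)
```

Cards that tie between several cars, or that match only a handful of stops, are where this kind of matching is least certain.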
#QUESTION 4
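# Keep one row per distinct stop location and geometry
gps_sf_stopfin_unique1 <- gps_sf_stopfin %>%
  select(Possible_Location, geometry) %>%
  unique() %>%
  arrange(Possible_Location)
# Drop five rows by position (the indices are specific to this dataset)
gps_sf_stopfin_unique <- gps_sf_stopfin_unique1[-c(221, 476, 644, 1100, 1103), ]
# Join the points of each location into one LINESTRING.
# mean() of a geometry column is not meaningful, so count the points instead.
gps_path_stopfin <- gps_sf_stopfin_unique %>%
  group_by(Possible_Location) %>%
  summarize(n_points = n(),
            do_union = FALSE) %>%
  st_cast("LINESTRING")
# Street network of Abila
Abila_st <- st_read(dsn = "MC2/Geospatial",
                    layer = "Abila")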
Reading layer `Abila' from data source
`C:\jovinkahartanto\assignment_distill\MC2\Geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 3290 features and 9 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 24.82401 ymin: 36.04502 xmax: 24.90997 ymax: 36.09492
Geodetic CRS: WGS 84
# Count the points in each street feature and keep features with more than one point
P <- npts(Abila_st, by_feature = TRUE)
abila_st_2 <- cbind(Abila_st, P) %>% filter(P > 1)
# Tourist map used as the background raster
bgmap <- raster("MC2/Geospatial/MC2-tourist.TIF")
tmap_mode("view")
# Background map, street network (red), and selected stop paths (blue)
tm_shape(bgmap) +
  tm_rgb(bgmap, r = 1, g = 2, b = 3,
         alpha = NA,
         saturation = 1,
         interpolate = TRUE,
         max.value = 255) +
  tm_shape(abila_st_2) +
  tm_lines(col = "red", scale = 1) +
  tm_shape(gps_path_stopfin[c(50:51, 53:55), ]) +
  tm_lines(col = "blue", scale = 5, interactive = TRUE) +
  tm_text("Possible_Location", size = 2, remove.overlap = TRUE,
          overwrite.lines = TRUE, just = "top")
They usually visited these meeting points at lunch time, between about 11:30 AM and 12:30 PM. In a few cases, two or three of them visited the same place at the same time.
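Simultaneous visits like these can be detected by pairing stays at the same place and keeping the pairs whose time windows overlap. The following is a sketch on invented data; the `stays` table, its column names, the employee IDs, and the place labels are all hypothetical.

```r
library(dplyr)

# Hypothetical stays: one row per employee visit to a place.
stays <- data.frame(
  id = c(7, 21, 33, 41),
  place = c("Unknown 1", "Unknown 1", "Unknown 2", "Unknown 1"),
  arrive = as.POSIXct(c("2014-01-08 11:30", "2014-01-08 11:45",
                        "2014-01-08 11:40", "2014-01-09 11:30"), tz = "UTC"),
  depart = as.POSIXct(c("2014-01-08 12:30", "2014-01-08 12:15",
                        "2014-01-08 12:20", "2014-01-09 12:30"), tz = "UTC")
)

co_visits <- merge(stays, stays, by = "place", suffixes = c("_a", "_b")) %>%
  filter(
    id_a < id_b,          # each pair once, and no self-pairs
    arrive_a < depart_b,  # the two intervals overlap when each one
    arrive_b < depart_a   # starts before the other ends
  )

print(co_visits)
```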
The date, time, and duration of these four employees’ visits to the five unknown places can be seen in the interactive graph below.
# gps_path_selected <- gps_path %>%
# filter(id==28)
#
#
# bgmap <- raster("MC2/Geospatial/MC2-tourist.TIF")
#
# tmap_mode("view")
# tm_shape(bgmap) +
# tm_rgb(bgmap, r = 1, g = 2, b = 3,
# alpha = NA,
# saturation = 1,
# interpolate = TRUE,
# max.value = 255) +
# tm_shape(abila_st_2) +
# tm_lines(col = "red", scale = 1)+
# tm_shape(gps_path_selected[1:4,]) +
# tm_lines(col = "black", scale =4, interactive = TRUE) +
# tm_text("Hour", size = 3, remove.overlap = TRUE, overwrite.lines = TRUE, just = "top")